Longest Common Subsequence from Fragments via Sparse Dynamic
Author
Abstract
Sparse Dynamic Programming has emerged as an essential tool for the design of efficient algorithms for optimization problems coming from such diverse areas as Computer Science, Computational Biology and Speech Recognition [7, 11, 15]. We provide a new Sparse Dynamic Programming technique that extends the Hunt-Szymanski [2, 9, 8] paradigm for the computation of the Longest Common Subsequence (LCS) and apply it to solve the LCS from Fragments problem: given a pair of strings X and Y (of length n and m, respectively) and a set M of matching substrings of X and Y, find the longest common subsequence based only on the symbol correspondences induced by the substrings. This problem arises in an application to the analysis of software systems. Our algorithm solves the problem in O(|M| log |M|) time using balanced trees, or O(|M| log log min(|M|, nm/|M|)) time using Johnson's version of Flat Trees [10]. These bounds apply for two cost measures. The algorithm can also be adapted to finding the usual LCS in O((m + n) log |Σ| + |M| log |M|) time using balanced trees or O((m + n) log |Σ| + |M| log log min(|M|, nm/|M|)) time using Johnson's Flat Trees, where M is the set of maximal matches between substrings of X and Y and Σ is the alphabet.
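As a rough illustration of the Hunt-Szymanski paradigm that the abstract builds on, the sketch below computes the length of an ordinary LCS by visiting only the matching position pairs instead of the full n-by-m table. It is not the fragment-based algorithm of the paper: it handles single-symbol matches only, uses a sorted list with binary search in place of balanced trees or Flat Trees, and the function name `lcs_length_sparse` is a hypothetical choice made for this example.

```python
from bisect import bisect_left
from collections import defaultdict


def lcs_length_sparse(x: str, y: str) -> int:
    """Length of an LCS of x and y, visiting only matching positions.

    Illustrative Hunt-Szymanski-style sketch (single-symbol matches),
    not the fragment algorithm described in the abstract.
    """
    # positions[c] lists the indices j with y[j] == c, in increasing order.
    positions = defaultdict(list)
    for j, c in enumerate(y):
        positions[c].append(j)

    # thresholds[k] is the smallest index in y at which a common
    # subsequence of length k + 1 can end, given the prefix of x seen so far.
    thresholds = []
    for c in x:
        # Visit the matches of this symbol in decreasing j so that the
        # same symbol of x is never used twice in one subsequence.
        for j in reversed(positions.get(c, ())):
            k = bisect_left(thresholds, j)
            if k == len(thresholds):
                thresholds.append(j)
            else:
                thresholds[k] = j
    return len(thresholds)
```

With r matching pairs, this sketch does O(r log m) work for the updates plus linear preprocessing, so it only pays off when matches are sparse relative to nm. Grouping matches into fragments, as the abstract describes, is what lets the running time depend on |M| rather than on the number of individual symbol matches.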
Similar resources
Efficient algorithms for the longest common subsequence in $k$-length substrings
Finding the longest common subsequence in k-length substrings (LCSk) is a recently proposed problem motivated by computational biology. This is a generalization of the well-known LCS problem in which matching symbols from two sequences A and B are replaced with matching non-overlapping substrings of length k from A and B. We propose several algorithms for LCSk, being non-trivial incarnations of...
Sparse Dynamic Programming for Longest Common Subsequence from Fragments
Sparse Dynamic Programming has emerged as an essential tool for the design of efficient algorithms for optimization problems coming from such diverse areas as computer science, computational biology, and speech recognition. We provide a new sparse dynamic programming technique that extends the Hunt–Szymanski paradigm for the computation of the longest common subsequence (LCS) and apply it to so...
New Tabulation and Sparse Dynamic Programming Based Techniques for Sequence Similarity Problems
Calculating the length of a longest common subsequence (LCS) of two strings A and B of length n and m is a classic research topic, with many worst-case oriented results known. We present two algorithms for LCS length calculation with respectively O(mn log log n / log n) and O(mn / log n + r) time complexity, the latter working for r = o(mn / (log n log log n)), where r is the number of matches in the ...
A simple algorithm for the constrained sequence problems
In this paper we address the constrained longest common subsequence problem. Given two sequences X, Y and a constrained sequence P, a sequence Z is a constrained longest common subsequence for X and Y with respect to P if Z is the longest subsequence of X and Y such that P is a subsequence of Z. Recently, Tsai [7] proposed an O(n^2 · m^2 · r) time algorithm to solve this problem using dynamic prog...
The constrained longest common subsequence problem
This paper considers a constrained version of the longest common subsequence problem for two strings. Given strings S1, S2 and P, the constrained longest common subsequence problem for S1 and S2 with respect to P is to find a longest common subsequence lcs of S1 and S2 such that P is a subsequence of this lcs. An O(r n^2 m^2) time algorithm based upon the dynamic programming technique is proposed for...